Evaluating Data Mining Models: A Pattern Language
Authors
Abstract
This paper extracts and documents patterns that identify recurring solutions to the problem of evaluating data mining models. The five patterns presented in this paper are organized as a pattern language. The patterns differ in their context of application and in how they solve the evaluation problem, especially when only limited amounts of data are available. Another contribution of this paper is the introduction of a new pattern section called “Force Resolution Map”. We believe that Force Resolution Maps not only illuminate these data mining patterns but are also generally useful in explicating any pattern.
Similar resources
Pattern Directed Mining of Sequence Data
Sequence data arise naturally in many applications, and can be viewed as an ordering of events, where each event has an associated time of occurrence. An important characteristic of event sequences is the occurrence of episodes, i.e. a collection of events occurring in a certain pattern. Of special interest are frequent episodes, i.e. episodes occurring with a frequency above a certain threshold...
Finding Sequential Patterns from Large Sequence Data
Data mining is the task of discovering interesting patterns from large amounts of data. There are many data mining tasks, such as classification, clustering, association rule mining, and sequential pattern mining. Sequential pattern mining finds sets of data items that occur together frequently in some sequences. Sequential pattern mining, which extracts frequent subsequences from a sequence da...
Query Languages Supporting Descriptive Rule Mining: A Comparative Study
Recently, inductive databases (IDBs) have been proposed to tackle the problem of knowledge discovery from huge databases. With an IDB, the user/analyst performs a set of very different operations on data using a query language, powerful enough to support all the required manipulations, such as data preprocessing, pattern discovery and pattern post-processing. We provide a comparison between thr...
Mining and Using Sets of Patterns through Compression
In this chapter we describe how to successfully apply the MDL principle to pattern mining. In particular, we discuss how pattern-based models can be designed and induced by means of compression, resulting in succinct and characteristic descriptions of the data. As motivation, we argue that traditional pattern mining asks the wrong question: instead of asking for all patterns satisfying some int...
Data Mining
Data mining provides approaches for the identification and discovery of non-trivial patterns and models hidden in large collections of data. In the applied natural language processing domain, data mining usually requires preprocessed data that has been extracted from textual documents. Additionally, this data is often integrated with other data sources. This chapter provides an overview on data...